multiclass classifier
New Machine Learning Approaches for Intrusion Detection in ADS-B
Ngamboé, Mikaëla, Marrocco, Jean-Simon, Ouattara, Jean-Yves, Fernandez, José M., Nicolescu, Gabriela
With the growing reliance on the vulnerable Automatic Dependent Surveillance-Broadcast (ADS-B) protocol in air traffic management (ATM), ensuring security is critical. This study investigates emerging machine learning models and training strategies to improve AI-based intrusion detection systems (IDS) for ADS-B. Focusing on ground-based ATM systems, we evaluate two deep learning IDS implementations: one using a transformer encoder and the other an extended Long Short-Term Memory (xLSTM) network, marking the first xLSTM-based IDS for ADS-B. A transfer learning strategy was employed, involving pre-training on benign ADS-B messages and fine-tuning with labeled data containing instances of tampered messages. Results show this approach outperforms existing methods, particularly in identifying subtle attacks that progressively undermine situational awareness. The xLSTM-based IDS achieves an F1-score of 98.9%, surpassing the transformer-based model at 94.3%. Tests on unseen attacks validated the generalization ability of the xLSTM model. Inference latency analysis shows that the 7.26-second delay introduced by the xLSTM-based IDS fits within the Secondary Surveillance Radar (SSR) refresh interval (5-12 s), although it may be restrictive for time-critical operations. While the transformer-based IDS achieves a 2.1-second latency, it does so at the cost of lower detection performance.
CaliMatch: Adaptive Calibration for Improving Safe Semi-supervised Learning
Bae, Jinsoo, Kim, Seoung Bum, Do, Hyungrok
Semi-supervised learning (SSL) uses unlabeled data to improve the performance of machine learning models when labeled data is scarce. However, its real-world applications often face the label distribution mismatch problem, in which the unlabeled dataset includes instances whose ground-truth labels are absent from the labeled training dataset. Recent studies, referred to as safe SSL, have addressed this issue by using both classification and out-of-distribution (OOD) detection. However, the existing methods may suffer from overconfidence in deep neural networks, leading to increased SSL errors because of high confidence in incorrect pseudo-labels or OOD detection. T o address this, we propose a novel method, CaliMatch, which calibrates both the classifier and the OOD detector to foster safe SSL. CaliMatch presents adaptive label smoothing and temperature scaling, which eliminates the need to manually tune the smoothing degree for effective calibration. W e give a theoretical justification for why improving the calibration of both the classifier and the OOD detector is crucial in safe SSL. Extensive evaluations on CIF AR-10, CIF AR-100, SVHN, TinyImageNet, and ImageNet demonstrate that CaliMatch outperforms the existing methods in safe SSL tasks.
Attention-Based Methods For Audio Question Answering
Sudarsanam, Parthasaarathy, Virtanen, Tuomas
Audio question answering (AQA) is the task of producing natural language answers when a system is provided with audio and natural language questions. In this paper, we propose neural network architectures based on self-attention and cross-attention for the AQA task. The self-attention layers extract powerful audio and textual representations. The cross-attention maps audio features that are relevant to the textual features to produce answers. All our models are trained on the recently proposed Clotho-AQA dataset for both binary yes/no questions and single-word answer questions. Our results clearly show improvement over the reference method reported in the original paper. On the yes/no binary classification task, our proposed model achieves an accuracy of 68.3% compared to 62.7% in the reference model. For the single-word answers multiclass classifier, our model produces a top-1 and top-5 accuracy of 57.9% and 99.8% compared to 54.2% and 93.7% in the reference model respectively. We further discuss some of the challenges in the Clotho-AQA dataset such as the presence of the same answer word in multiple tenses, singular and plural forms, and the presence of specific and generic answers to the same question. We address these issues and present a revised version of the dataset.
Constructing Multiclass Classifiers using Binary Classifiers Under Log-Loss
Ben-Yishai, Assaf, Ordentlich, Or
The construction of multiclass classifiers from binary classifiers is studied in this paper, and performance is quantified by the regret, defined with respect to the Bayes optimal log-loss. We start by proving that the regret of the well known One vs. All (OVA) method is upper bounded by the sum of the regrets of its constituent binary classifiers. We then present a new method called Conditional OVA (COVA), and prove that its regret is given by the weighted sum of the regrets corresponding to the constituent binary classifiers. Lastly, we present a method termed Leveraged COVA (LCOVA), designated to reduce the regret of a multiclass classifier by breaking it down to independently optimized binary classifiers.
Binary and Multiclass Classifiers based on Multitaper Spectral Features for Epilepsy Detection
Oliva, Jefferson Tales, Rosa, João Luís Garcia
Epilepsy is one of the most common neurological disorders that can be diagnosed through electroencephalogram (EEG), in which the following epileptic events can be observed: pre-ictal, ictal, post-ictal, and interictal. In this paper, we present a novel method for epilepsy detection into two differentiation contexts: binary and multiclass classification. For feature extraction, a total of 105 measures were extracted from power spectrum, spectrogram, and bispectrogram. For classifier building, eight different machine learning algorithms were used. Our method was applied in a widely used EEG database. As a result, random forest and backpropagation based on multilayer perceptron algorithms reached the highest accuracy for binary (98.75%) and multiclass (96.25%) classification problems, respectively. Subsequently, the statistical tests did not find a model that would achieve a better performance than the other classifiers. In the evaluation based on confusion matrices, it was also not possible to identify a classifier that stands out in relation to other models for EEG classification. Even so, our results are promising and competitive with the findings in the literature.
Robustness from Simple Classifiers
Qian, Sharon, Kalimeris, Dimitris, Kaplun, Gal, Singer, Yaron
Despite the vast success of Deep Neural Networks in numerous application domains, it has been shown that such models are not robust i.e., they are vulnerable to small adversarial perturbations of the input. While extensive work has been done on why such perturbations occur or how to successfully defend against them, we still do not have a complete understanding of robustness. In this work, we investigate the connection between robustness and simplicity. We find that simpler classifiers, formed by reducing the number of output classes, are less susceptible to adversarial perturbations. Consequently, we demonstrate that decomposing a complex multiclass model into an aggregation of binary models enhances robustness. This behavior is consistent across different datasets and model architectures and can be combined with known defense techniques such as adversarial training. Moreover, we provide further evidence of a disconnect between standard and robust learning regimes. In particular, we show that elaborate label information can help standard accuracy but harm robustness.
A Conjoint Application of Data Mining Techniques for Analysis of Global Terrorist Attacks -- Prevention and Prediction for Combating Terrorism
Kumar, Vivek, Mazzara, Manuel, Gen., Maj., Messina, Angelo, Lee, JooYoung
Terrorism has become one of the most tedious problems to deal with and a prominent threat to mankind. To enhance counter-terrorism, several research works are developing efficient and precise systems, data mining is not an exception. Immense data is floating in our lives, though the scarce availability of authentic terrorist attack data in the public domain makes it complicated to fight terrorism. This manuscript focuses on data mining classification techniques and discusses the role of United Nations in counter-terrorism. It analyzes the performance of classifiers such as Lazy Tree, Multilayer Perceptron, Multiclass and Na\"ive Bayes classifiers for observing the trends for terrorist attacks around the world. The database for experiment purpose is created from different public and open access sources for years 1970-2015 comprising of 156,772 reported attacks causing massive losses of lives and property. This work enumerates the losses occurred, trends in attack frequency and places more prone to it, by considering the attack responsibilities taken as evaluation class.
Learning With Single-Teacher Multi-Student
You, Shan (Peking University) | Xu, Chang (University of Sydney) | Xu, Chao (Peking University) | Tao, Dacheng (University of Sydney)
In this paper we study a new learning problem defined as "Single-Teacher Multi-Student" (STMS) problem, which investigates how to learn a series of student (simple and specific) models from a single teacher (complex and universal) model. Taking the multiclass and binary classification for example, we focus on learning multiple binary classifiers from a single multiclass classifier, where each of binary classifier is responsible for a certain class. This actually derives from some realistic problems, such as identifying the suspect based on a comprehensive face recognition system. By treating the already-trained multiclass classifier as the teacher, and multiple binary classifiers as the students, we propose a gated support vector machine (gSVM) as a solution. A series of gSVMs are learned with the help of single teacher multiclass classifier. The teacher's help is two-fold; first, the teacher's score provides the gated values for students' decision; second, the teacher can guide the students to accommodate training examples with different difficulty degrees. Extensive experiments on real datasets validate its effectiveness.
Fast Reinforcement Learning with Large Action Sets using Error-Correcting Output Codes for MDP Factorization
Dulac-Arnold, Gabriel, Denoyer, Ludovic, Preux, Philippe, Gallinari, Patrick
The use of Reinforcement Learning in real-world scenarios is strongly limited by issues of scale. Most RL learning algorithms are unable to deal with problems composed of hundreds or sometimes even dozens of possible actions, and therefore cannot be applied to many real-world problems. We consider the RL problem in the supervised classification framework where the optimal policy is obtained through a multiclass classifier, the set of classes being the set of actions of the problem. We introduce error-correcting output codes (ECOCs) in this setting and propose two new methods for reducing complexity when using rollouts-based approaches. The first method consists in using an ECOC-based classifier as the multiclass classifier, reducing the learning complexity from O(A2) to O(Alog(A)). We then propose a novel method that profits from the ECOC's coding dictionary to split the initial MDP into O(log(A)) seperate two-action MDPs. This second method reduces learning complexity even further, from O(A2) to O(log(A)), thus rendering problems with large action sets tractable. We finish by experimentally demonstrating the advantages of our approach on a set of benchmark problems, both in speed and performance.
Modeling the Detection of Textual Cyberbullying
Dinakar, Karthik (Massachusetts Institute of Technology) | Reichart, Roi (Hebrew University of Jerusalem) | Lieberman, Henry (Massachusetts Institute of Technology)
The scourge of cyberbullying has assumed alarming proportions with an ever-increasing number of adolescents admitting to having dealt with it either as a victim or as a bystander. Anonymity and the lack of meaningful supervision in the electronic medium are two factors that have exacerbated this social menace. Comments or posts involving sensitive topics that are personal to an individual are more likely to be internalized by a victim, often resulting in tragic outcomes. We decompose the overall detection problem into detection of sensitive topics, lending itself into text classification sub-problems. We experiment with a corpus of 4500 YouTube comments, applying a range of binary and multiclass classifiers. We find that binary classifiers for individual labels outperform multiclass classifiers. Our findings show that the detection of textual cyberbullying can be tackled by building individual topic-sensitive classifiers.